NMRDPP: A System for Decision-Theoretic Planning with Non-Markovian Rewards

Authors

  • Charles Gretton
  • David Price
  • Sylvie Thiébaux
Abstract

This paper examines a number of solution methods for decision processes with non-Markovian rewards (NMRDPs). They all exploit a temporal logic specification of the reward function to automatically translate the NMRDP into an equivalent Markov decision process (MDP) amenable to well-known MDP solution methods. They differ, however, in the representation of the target MDP and in the class of MDP solution methods to which they are suited. As a result, they adopt different temporal logics and different translations. Unfortunately, no implementation of these methods, nor any experimental (let alone comparative) results, has ever been reported. This paper is the first step towards filling this gap. We describe an integrated system for solving NMRDPs which implements these methods and several variants under a common interface; we use it to compare the various approaches and identify certain problem features favouring one approach over the others.
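The translation the abstract describes can be illustrated with a minimal sketch. This is not the NMRDPP system itself: the toy state space, the move table, and the "reward in `g` only if `p` was visited earlier" condition are all assumptions chosen for illustration. The key idea, common to the methods surveyed, is that a history-dependent reward becomes Markovian once each state is paired with the mode of a small automaton that tracks the relevant history.

```python
# A minimal sketch (assumed toy example, not the NMRDPP system) of turning a
# non-Markovian reward into a Markovian one by a product construction:
# states are paired with the mode of an automaton tracking "p has been visited".

# Hypothetical deterministic dynamics over three states.
MOVES = {("s0", "a"): "p", ("p", "a"): "g", ("s0", "b"): "g", ("g", "a"): "s0"}

def expand_to_mdp():
    """Build an equivalent MDP over (state, mode) pairs with a Markovian reward."""
    mdp = {}
    for (s, a), s2 in MOVES.items():
        for mode in ("init", "seen_p"):
            # Automaton update: remember, once and for all, that 'p' was occupied.
            mode2 = "seen_p" if (mode == "seen_p" or s == "p") else "init"
            # The reward now depends only on the augmented successor state.
            reward = 1 if (s2 == "g" and mode2 == "seen_p") else 0
            mdp[((s, mode), a)] = ((s2, mode2), reward)
    return mdp

mdp = expand_to_mdp()
# Reaching 'g' via 'p' earns the reward; reaching 'g' directly does not.
assert mdp[(("p", "init"), "a")] == (("g", "seen_p"), 1)
assert mdp[(("s0", "init"), "b")] == (("g", "init"), 0)
```

The augmented MDP can then be handed to any standard solution method (value iteration, policy iteration, or heuristic search); the approaches compared in the paper differ in which temporal logic specifies the reward and in how lazily this expansion is performed.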


Related Articles

Decision-Theoretic Planning with non-Markovian Rewards

A decision process in which rewards depend on history rather than merely on the current state is called a decision process with non-Markovian rewards (NMRDP). In decision-theoretic planning, where many desirable behaviours are more naturally expressed as properties of execution sequences rather than as properties of states, NMRDPs form a more natural model than the commonly adopted fully Markovi...



Rewarding Behaviors

Markov decision processes (MDPs) are a very popular tool for decision-theoretic planning (DTP), partly because of the well-developed, expressive theory that includes effective solution techniques. But the Markov assumption, that dynamics and rewards depend on the current state only, and not on history, is often inappropriate. This is especially true of rewards: we frequently wish to associate rew...


A Model-Checking Approach to Decision-Theoretic Planning with Non-Markovian Rewards

A popular approach to solving a decision process with non-Markovian rewards (NMRDP) is to exploit a compact representation of the reward function to automatically translate the NMRDP into an equivalent Markov decision process (MDP) amenable to our favorite MDP solution method. The contribution of this paper is a representation of non-Markovian reward functions and a translation into MDP aimed a...


Structured Solution Methods for Non-Markovian Decision Processes

Markov Decision Processes (MDPs), currently a popular method for modeling and solving decision-theoretic planning problems, are limited by the Markovian assumption: rewards and dynamics depend on the current state only, and not on previous history. Non-Markovian decision processes (NMDPs) can also be defined, but then the more tractable solution techniques developed for MDPs cannot be directly...



Publication year: 2003